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Abstract 

We discuss how to describe the Markov chain underlying a generalized stochastic Petri net using Kro- 
necker operators on smaller matrices. We extend previous approaches by allowing both an extensive 
type of marking-dependent behavior for the transitions and the presence of immediate synchronizations. 
The derivation of the results is thoroughly formalized, including the use of Kronecker operators in the 
treatment of the vanishing markings and the computation of impulse-based reward measures. We use 
our techniques to analyze a model whose solution using conventional methods would fail because of the 
state-space explosion. In the conclusion, we point out ideas to parallelize our approach. 
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1 Introduction 


Generalized stochastic Petri nets (GSPNs) [2, 3, 6] are ideally suited to model a large class of performance 
and reliability problems, but their numerical analysis requires the solution of a very large continuous- 
time Markov chain (CTMC). The size of the transition rate matrix R for the CTMC is the main obstacle, 
since its memory requirements can easily exceed the capacity of today’s (and tomorrow’s) machines even 
when sparse storage techniques are employed. 

A possible approach to this problem is to store R implicitly. Plateau [14] proposed the use of 
Kronecker operators for the description of the transition rate matrix of a structured model composed 
of a set of “synchronized” stochastic automata, a subclass of GSPNs. Buchholz [4] used a similar idea 
for Markovian closed asynchronous queueing networks, and Takahashi [17] used it for open queueing 
networks with communication blocking. Donatelli [10, 11] adapted the approach to GSPNs, defining 
first the “superposed stochastic automata”, then the “superposed GSPNs”. Later, Buchholz [5] applied 
the concept to the special case of marked graphs and Kemper [12] addressed some of the problems in 
[11], extending the applicability of the results. 

These approaches have in common a decomposition of the model into a set of submodels, so that 
the state space of the CTMC underlying the entire model is a subset of the cross-product of the state 
spaces of the CTMCs underlying each submodel. This implies that the transition rate for the entire 
model can be described using Kronecker operators on smaller matrices. 

Focusing on GSPNs, the result of decomposing a GSPN is a set of largely independent sub-GSPNs, 
but some transitions will be shared by multiple sub-GSPNs, to model interactions among them. If a 
transition tj is shared by two sub-GSPNs A\ and A 2 , and tj has input and output places in both of 
them, this models a synchronization. A\ and A 2 “wait for each other” and the event corresponding to 
tj occurs only when they are both “ready”. An alternative case arises when tj has its input places in A\ 
and its output places in A 2 . This describes an asynchronous (and asymmetric) communication, since A 2 
must “wait for permission” from A\. However, if an output place of tj in A 2 has a capacity defined for 
it, A 1 will wait for A 2 as well. One of the contributions of our work is to present a unified framework for 
all these types of interactions. Indeed, we do not assume that the net possesses any particular structure. 

The solution of a decomposed GSPN is then based on the following idea [11], Let rii be the number 
of states in <S^, the state space for the CTMC underlying the z-th sub-GSPN, i = l,...,iV. S' T = 
<S 1 x . . . x S$ D St is the “potential state space” for the model, usually (much) larger than the actual 
state space St, so that a transition rate matrix R' is defined using Kronecker algebra: 

N N 

R' = 0R*'+ £ ®R^, 

i = 1 tj£T* i = 1 

where R* describes the local transitions for the z-th sub-GSPN (including the effect of immediate 
transitions), T* is the set of synchronizing transitions, which must be timed, and the “corrective matrix” 
R* J describes the effect of tj on the z-th sub-GSPN. 

The actual transition rate matrix R can be obtained from R' by eliminating the rows and columns 
corresponding to the unreachable states in S' T \ St , but this requires an additional overhead, since 
the composition of a marking does not directly indicate whether it is reachable or not. Hence, an 
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alternative approach was initially suggested. The power or Jacobi method is used to compute the 
steady-state solution in a vector tz' of size By assigning a nonzero initial probability only to 

markings in St, the solution tz' is guaranteed to be zero for any marking m £ S' T \ St- This simplifies 
the algorithm, since m can now be interpreted as the mixed-base integer index of the corresponding 
entries in R' and tz' , but the memory requirement might be excessive when |<S^| |<Sx|. 

To reduce the impact of unreachable markings, Kemper [12] proposed a technique that only requires 
a probability vector 7Z of size |<Sx| . In the numerical iterations, for each m £ St, each entry n > 0 
is generated (this implies n £ St) and, given n, its index k in 7Z, or its lexicographic position in St, is 
computed in O(log |<Sx|) operations, using a binary search. 

We unify previous work by offering a thorough discussion of the structure of the underlying CTMC, 
including the management of immediate transitions and vanishing markings. Our formalism is more 
general than those assumed in [4, 5, 10, 11, 12], since we allow for marking-dependent arc cardinali- 
ties and rates, subject to certain restrictions, hence our results include those previously mentioned as 
special cases. Then, using an approach based on discrete-time Markov chains (DTMCs), we also re- 
move the main restriction previously imposed on the decomposition of the GSPN: we allow immediate 
synchronizing transitions. Finally, we consider a reward structure defined on the GSPN, and we show 
how to compute the expected reward in steady-state in the Kronecker framework. This is of particular 
importance for impulse rewards associated with immediate transitions, whose firing are only implicitly 
represented in R. 

The paper is structured as follows. Section 2 describes the notation used and recalls the main 
concepts of Kronecker algebra, GSPNs, Markov chains, and rewards. Section 3 presents the expression 
for the transition rate matrix of the CTMC underlying a generic GSPN, provided its transitions satisfy 
certain restrictions on the type of marking-dependency. The result is quite general, but not directly 
applicable, since it requires one to compute the inverse of a matrix described as the sum of Kronecker 
products. However, Sections 4 and 5 use it to derive computationally effective expressions for GSPNs 
with timed and immediate synchronizing transitions, respectively. Implementation and application of 
these results are shown in Section 6, including detailed information about computation time and memory 
requirements. Section 7 contains a summary and discusses future extensions, including distributed 
implementations and approximate solutions. 

2 Notation and definitions 

Except for IN, the sets of natural numbers, {0,1,2...}, and 1R , the set of real numbers, all sets are 
denoted by upper case calligraphic letters (e.g., A); vectors and matrices are denoted by lower and 
upper case bold letters, respectively (e.g., a, A); their entries are denoted by subscripts (e.g., a*,, A &,/); 
a set of indices can be used instead of a single index, for example, A x,y denotes the submatrix of 
A corresponding to set of rows X and the set of columns y. Superscripts denote families of related 
quantities (e.g., A 1 , A 2 ). Q xXy and \ xXy denote matrices with x rows and y columns, having all entries 
equal to 0 or 1, respectively, while 1^ denotes the identity matrix of size x X x] the dimensions of these 
matrices are omitted when they are clear from the context. Given a vector a, diag(a) is a square matrix 
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having a on the diagonal and zeros elsewhere. Given an n X n matrix A, rowsum(A) = diag(A • l„ x i) 
is a matrix having the diagonal equal to the sums of the entries on each row of A, and zeros elsewhere, 
while h(A) is a matrix having the same nonzero pattern as A, but with entries equal to either 0 or 1. 


2.1 Kronecker algebra 

We recall the definition of the Kronecker product B = 0^ =1 A k of K matrices A k £ ]ft n k Xm k^ us i n g the 
convention that the rows and columns of both B and the matrices A k are indexed starting at 0. The 
generic element of B £ JRY[k=i nkX Yik=i mk is 


B 


(•••((£ )«' 2+£)«'3 —)n k +ik,(---((h)rn2+32)rno,---)m k +jk 


A 1 A 2 

'lJl *2j2 


•••A 


K 

* K, 3K 


with 0 < ik < rik and 0 < jk < rrik, for k = 1 , . . . , Ah Assuming a mixed-base numbering scheme 
so that the tuple (h, / 2 , . . . Ik) corresponds to row (,..((/i)n 2 + / 2 )n 3 • • -)nk + h or column (,..((/i)m 2 + 
/ 2 )m 3 • • respectively, we will also write the above quantity, more succinctly, as B 

The Kronecker sum ®^ =1 A k of K square matrices A k £ ]R n k Xn k dehned as 

K I< 

0 A k = Y, Ini ® ® b,.! <g> A k ® I nk+1 •••(£) I nK . 

k = 1 k = 1 


2.2 Generalized stochastic Petri nets 

A generalized stochastic Petri net (GSPN) is a tuple ("P, T,X, C“, C + , m°, w), where: 

• V = {pi, . . . ,P|n|} is a finite set of places , drawn as circles in the graphical representation of the 
GSPN. A non-negative integer vector m £ called marking describes the number of tokens in 
each place. Given a place pi £ V, rrq is the number of tokens in pi for marking m. 

• T = {G, . . . ,t\r\} is a finite set of transitions , V fl T = 0. 

» I C T is the subset of immediate transitions, drawn as thin bars, while X = T \ X are the timed 

transitions , drawn as rectangles. The firing time of immediate transitions is zero, while that of 
timed transitions is exponentially distributed. 

• C“ and C+ are incidence matrices of size \V\ X \T\. Their elements are functions from iVX to 
IV. C,~-(m) and Ch(m) denote the marking-dependent integer cardinality assigned to the input 
arc from pi to tj and the output arc from tj to pi respectively. In the graph, these arcs are drawn 
using an arrowhead pointing to the destination if their cardinality is not identically equal to zero. 
The cardinality function is indicated on the arc unless it is identically equal to one. 

• m° is the initial marking. In the graph, the value of m° is written inside place py if positive. 

• For any t 3 £ T, Wj is a function from to 1R. Wj(m) is the weight associated with transition 

tj in marking m. According to whether tj is immediate or timed, this weight represents an 

(unnormalized) firing probability , or a firing rate. 
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A transition tj G T has concession in marking m iff 

\/p t G V, C^j(m) < m 8 , or C^ J (m) < m. 

If any immediate transition has concession in m, it is enabled , and m is said to be vanishing. Otherwise, 
m is said to be tangible and any timed transition tj with concession is enabled in m. In other words, a 
timed transition is not enabled in a vanishing marking even if it has concession. 

Some definitions of GSPNs allow one to disable a transition tj with concession in m by specifying a 
zero weight Wj(m) for it, or by introducing inhibitor arcs, drawn with a circle instead of an arrowhead. 
In a marking m, an inhibitor arc from place pi to transition tj with cardinality c(m) disables tj if m 8 > 
c(m). Since these behaviors can be represented by an appropriate definition of input arc cardinalities 
in our formalism, we assume, without loss of generality, that Wj(m) > 0 if tj is enabled in m, and we 
use inhibitor arcs in our models merely as a shorthand. 

Let £(m) denote the set of enabled transitions in marking m. An enabled transition tj firing in 
marking m yields a new marking n such that 

\/p t G V, n 8 = m, - CW(m) + C+-(m) = m, + C 8J (m) (or n = m + Cpj(m)), 

where C = C+ - C" is the incidence matrix of the GSPN. We can also write m-bn to express that tj 
has concession in m and that n is obtained from m by firing tj, regardless of whether tj G £(m) or not 
(tj is not enabled if it is timed and m is vanishing, or if Wj(m) = 0). 

The firing probability of a transition tj enabled in marking m is 

W i( m ) 

E<£( m) W;(m) ' 

If m is tangible, this corresponds to a race between the exponentially distributed firing times of the 
enabled transitions, with rates given by w. In a vanishing marking, instead, weights define a probabilistic 
choice, since the race model does not specify how to choose which transition to fire next when multiple 
enabled transitions have the same zero firing time. 

2.3 Reachability set 

The reachability set S is defined as the set of markings reachable from the initial marking m° by firing 
any sequence of enabled transitions. Formally, S is the smallest subset of containing m° and such 
that m G S, tj G £(m), and m-bn imply n G S. Fig. 1 shows the skeleton of an algorithm to build the 
set of reachable markings S (which we assume finite from now on). Particular care must be placed on 
the implementation of statement 9, since the size of the set S to be searched is very large in practice. 
Efficient methods include hashing or balanced search trees (e.g., AVL trees [18]). While not explicitly 
stated in the algorithm, S should be stored as the union of two disjoint sets, St and <Sy, corresponding 
to the tangible and vanishing markings, respectively. 

A function T assigns an index to each reachable marking, according to a lexicographic order, indi- 
cated by ‘V”: 

T : S — >■ {0, . . . , |<S| — 1} such that 'F(m) > 'F(n) m >- n. 
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Algorithm BuildRS (input: ("P, T, Z, C ,C + ,m°,w); output: <S); 

1. S <— {m 0 }; /* S contains the markings found so far */ 

2. U <- {m 0 }; /*U C iS contains the found but unexplored markings */ 

3. while U ^ 0 do 

4. “choose a marking m from W \ 

5. IA < — IA \ {m}; 

6. “compute £(m)”; 

7. for each j £ £(m) do 

8. n <— m + C-pj(m); 

9. if n S then 

10. U^IA U{n}; 

11. S < — S U {n}; 

12. end if 

13. end for 

14. end while 


Figure 1: Algorithm BuildRS 

If an AVL tree is used, T can be precomputed with a simple preorder visit of the tree, and its value can 
be stored in the nodes of the tree. Then, given m £ the value k = 'F(m) can be found in O(log |<S|) 

operations using the AVL tree augmented with this additional information ('F(m) = “undefined” for 
any m ^ S). The restrictions of T to the tangible and vanishing markings, can be defined accordingly: 
Ty : St {0, . . . , \St\ — 1} and Ty : Sy —■ ” {0, . . . , |<Sy | — 1}. 

In the following, with a slight overloading in the notation, we use a marking m to index data 
structures (vectors and matrices) referring to <S, St, or Sy- Strictly speaking, we should use instead 
'F(m), 'Fy(m), and fy(m), respectively, but this would make the expressions excessively cumbersome. 
Nevertheless, it is important to stress this fundamental difference from a computational point of view; 
finding the index of a marking is a potential source of additional complexity in any structured approach. 

2.4 Underlying continuous-time Markov chain and rewards 

We focus on the steady-state analysis of the continuous-time Markov chain (CTMC) underlying a GSPN, 
described by the infinitesimal generator matrix Q £ I x |<s T | ^ w } 1 i c ] 1 we assume ergodic (and finite, 
since S is finite): 

Q = R — A = Ry,x + Rx,v(I|s y | — U^y^Uvyr — A, (2) 

where R is the transition rate matrix, A = rowsum(R), and Rx,x and Rx,v (Uyj and U yy) describe 
the rates (probabilities) of going from tangible (vanishing) markings to tangible or vanishing markings, 
respectively. The entry of Rx,x (Rjy) corresponding to the row for m and the column for n describes 
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the rate from m £ Sj to n £ Sj (n £ <Sy): 


E w j( m )- 

tj 

tj££(m),m — m 

The entries of Uv,v and U^j are defined analogously, using the bring probability of immediate transition 
tj, given by (1), instead of the weight Wj(m). For a discussion of how to generate Q in practice, see 

[2, 6]. 

We observe that A mjin = Et.,e£(m) w j( m ) i s then the total rate leaving marking m £ St, and it 
equals the inverse of h m , the expected holding time in m. 

Let 7 r m be the steady-state probability of a tangible marking m (vanishing markings have zero 
probability). Then, the steady-state probability (row) vector 7T £ ]R) St \ satisfies the balance equation 


7 r 


Q = Oi x |s T | subject to the normalization 7 r • 1 |«s T | x i. = 1- 


( 3 ) 


We can specify a quantity of interest for the GSPN using a reward structure (p, r), where p(m) is 
the reward rate gained while the GSPN is in marking m, and rj(m) is the reward impulse gained when 
transition tj £ T fires in marking m. The expected reward rate in steady state is then 


E ^mKm) + E E $ 7,m rj(m), 

meS T me5tje£(m) 


( 4 ) 


where $j )in is the rate at which transition tj fires in steady state in marking m. If we let (j) £ be 
the vector describing the rate at which each marking is entered in steady state, $j )in is obtained as 

^j,m = 4*i 


Eq E£( III) W;(m) 

For m £ S T , </> m = Eqe£(m) w;(m) = 7r m A miin . For m £ S v , instead, 

= 'y . ‘ F n nl , 

nG*S T 

where, for n £ St and m £ Sy, the corresponding entry of matrix 

F = Rt,i-(P,,| - Vv.v)-' 6 


( 5 ) 


describes the rate at which a vanishing marking m is entered after leaving a tangible marking n and 
before reaching the next tangible marking. If no reward impulse is defined for immediate transitions, 
then Eq. (4) reduces to 

E 7 r m P(m)+ Wj(m)rj(m) 

me<S T \ qe£(m) 

In this work, we consider the structure of both Q and F, and the computation of 7 r and (j>. 
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3 Kronecker expression for the CTMC underlying a GSPN 

We now show how the ideas in [4, 5, 10, 11, 12] can be applied to individual places, not just to sub- 
GSPNs. Our goal is to clarify the relationship between Kronecker algebra and GSPNs, while relaxing 
several important restrictions on the type of interactions. Later we will merge individual places into 
“macroplaces”, which corresponds to the notion of sub-GSPNs. 


3.1 Using the state spaces of individual places 

The first extension regards the type of marking-dependency allowed in the GSPN. We allow the weight 
of a transition to be expressed as the product of “local effects” due to the number of tokens in each 
place: 

Vm G S,Vtj G T, < m =v Wj(m) = itW f[ (6) 

p t ev 

where w* can be interpreted as a constant “reference” weight, while the values Wij are dimensionless 
scaling functions [1], This “independence of effects” in the marking dependence implies, for example, 
that if markings m and n differ only in the number of tokens in py and if tj is enabled in both, 
Wj(n) = Wj(m) • Wij(ni ) / If a weight Wj does not depend on the number of tokens in pq we 
assume, without loss of generality, that Wij is identically equal to one. Note that we do not require the 
weight Wj(m) of a timed transition tj in a vanishing marking m to be zero. Doing so would make the 
specification of w for a given GSPN more difficult in practice, and is not required by our approach. 
Analogously, the dependence of the matrices C“ and C + , hence C, is assumed to be of the form 

Vm G S,Vpi G V,Vtj G T, C*j(m) = (7) 

where “*” is one of “+”, or nothing, and /?*• is a function from IN to IN. 

We can now state a theorem expressing the matrices Q and F of the CTMC underlying a GSPN in 
terms of smaller matrices related to each place-transition pair. 


Theorem 3.1 Consider a GSPN with finite reachability set S satisfying Eq. (6) and (7), 
and let — 1 be the bound of place p 4 - G V, that is, for any m G <S, rrq G {0, 1, . . . — 1} = S\ 

Define 

R/ = E w *j • 0 WM u ' = E w *j • 0 W E (8) 

Pi tj E2" pi 

where W' ,J is a square matrix of size rii X rii whose entry in position (r, c), for r, c G S\ is 
given by 


W' J (r, c) 


Wij(r) if r > Pi j(r) and c = r + fti,j(r) 
0 otherwise 


Also, define 

A' = rowsum(R / ) = E w ] ' (££) rowsum(W^) = E w ] ' (^) 

Pi Pi 


( 9 ) 
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r ' = rowsum(U / ) = ^ w* ■ (££) rowsum(W M ) = ^ w* ■ (^) i _UJ (10) 

tj G2" p%^.P v%^lP 

and T' = I|£/| — S(r'). Then, the matrix 

Q' = R' • (i - (T' + T')- 1 • U') _1 - A' (11) 

satishes Q = Qs s and F = Q s T s v i where Q and F have the meaning defined in Eq. (2) 
and (5). 

Proof: Matrices with superscript have row and column set S' = jo, 1, . . . (n Pt eV n i) ~ 1 } 5 
or iS 1 X • • • X <S^, if we identify a tuple with its mixed-base value. In the following, however, 
we partition matrices and permute their rows and columns so that the markings appear in 
lexicographic order within the sets St, Sy, and S'\S. This is for illustration purposes only. 

First, we prove that 


Rm,n = £ 


w 


J 


W 




Pi€.P 


= I>MI w« = £ 


( 12 ) 


111. II 


Pl€.p 


tj 


Let’s consider the contribution to this value for each timed tj £ X , by doing a case analysis: 

1. If m-bn, the contribution should be Wj(m). Indeed, for all pi £ V, W^( nj = jo,-j(m,-), 
hence the contribution of A is 


w j ■ II = w A m )- 

P,ev 


2. If tj does not have concession in m the contribution should be zero. Indeed, there must 
exist a place pi such that m 8 < C^m) = /3y( m i)- This implies nj = 0, and the 
contribution of t, is w* ■ T c pW !h „ = 0. 

J 3 A x Pi t r a- 11 * 

3. If m-bn' ^ n, the contribution of tj should be zero as well. Indeed, there must exist 
a place pi such that n 8 ^ m 8 + C 8J (m) = m 8 + /3 8J (m 8 ). Hence, W^. n . = 0, and 
w* ■Yl Pt er^ m „n, = 0 . 


Thus, the contribution of each transition in the summation is correct. An analogous argu- 
ment allows us to show that 


U 


/ 

m,n 


b 


(13) 


From Eq. (12) and (13), we can conclude that U^ t s , = 0, s ,^ s = = 0, and 

that the matrices Rx,x, Rxy, U v,t, and Uv,v for the underlying GSPN, with their rows 
and columns ordered according to T, can be expressed as: 


Rx,x — R/< 


S j 1 ,Sj 


Rxy = R' 


'St ,S\, 


U y,T ~ 


r 


i-i 

Sy ,Sy 


•U 


/ 

Sy ,Sjp 


Uv,v = 


r 


i-i 

Sy ,Sy 


•U 


/ 

Sy ,Sy 


(14) 
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(the normalization r'sy s v i s required because the weights of the immediate transitions 
enabled in a vanishing marking are not required to sum to one, while the entries in U v,t 
and Uv,v are probabilities). We can conclude A = A' St St as well. 

Hence, letting denote submatrices whose value is irrelevant, 



Substituting this value in the definition of Q' given in Eq. (11) completes the proof: 



If the number of tokens in p; is always at least to; > 0, we can, of course, define S l = {to;, . . . , n; — 1} 
or, equivalently, change the definition of the GSPN so that the range of tokens in p; becomes S l = 
{0, . . . , n; — to; — 1}. This would not affect the proofs in this paper, but could improve the efficiency of 
the implementation. 
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Figure 2: A case where m-bn, n £ S, but m S. 

It is important to stress that R^/ys $/ and U^,^ 5i5 , cannot be guaranteed to be zero (nor can R^, 5 ,, 
but this is because we allow timed transitions to have concession in vanishing markings). We now 
formalize this observation, already implicit in previous works using the Kronecker approach, since it is 
fundamental for a better understanding of the nature of the matrices R' and U 7 . 

Lemma 3.1 The matrices R' and LJ 7 defined in Theorem 3.1 satisfy the following “forward 
reachability condition” : 

m £ St A Rj n n >0 =v n £ S and m £ Sy A U( n n >0 =v n £ 5 (15) 

However, the analogous “backward reachability condition” does not hold: 
n £ S A R' n n >0 yb m £ Sr and n £ S A U' n n >0 yb m £ Sy (16) 

Proof: It is straightforward to show that Ecp (15) holds when the premises of Theorem 3.1 
are satisfied. We simply need to observe that, given the definition of R', R' n n > 0 and 
m £ St imply that there is a transition tj £ £(m) and that its firing leads to n; thus, n is 
reachable. An analogous reasoning holds for U 7 and m £ Sy. To show that Ecp (16) holds, it 
is sufficient to give an example; we do so for Rb Consider the GSPN in Fig. 2, having positive 
finite transition rates. The firing of two transitions f 2 , could lead to n = (0, 1, 1): by f 2 , 
from marking (1, 0, 0), and by H, from marking (1, 0, 1). In our notation, ??i = n - 2 = n§ = 2, 



Thus, the contribution of t\ to R / 101 011 is ■ W| q • Wq’| • > 0, hence R / 101 011 > 0. 

However, given the initial marking, m° = (1, 0, 0), the marking m = (1, 0, 1) is not reachable. 

□ 

Theorem 3.1 gives a characterization of the infinitesimal generator Q and of the matrix F of a GSPN 
by focusing on the effect of each transition on each place. An alternative statement of this theorem, is, 
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Figure 3: A GSPN where \St\ AC |<S'|, and one where |<S| = |<S' 


of course, 


Q 

F 


R 


/ 

S ,<Sj> 


+ R' 


St ,S v 



r 


i-i 

Sy ,Sy 


• U 


/ 

Sy ,Sy 



/ 

Sy ,St 



r 


i-i 

Sy ,Sy 


• U 


/ 

Sy ,Sy 


— 1 


A 


/ 

St ,St 


In either form, however, this result has little practical value in itself, since both expressions contain 
an inverse which cannot be expressed using Kronecker operators on smaller matrices. One case where 
Theorem 3.1 has a direct application is when there are no immediate transitions. Then, St = <S, Sy = 0 , 
and Ecp (2) simplifies to Q = R — A = R t,t — A = R^ t St — A' St s . 

However, a solution approach based on this idea alone has limitations, due to the restrictions that 
the GSPN must satisfy. Even more importantly, though, the size of Q' is enormous, potentially leading 
to inefficiencies. Consider for example the GSPN in Figure 3(a), having positive finite transitions rates. 
If the initial marking contains a total of n tokens, 


\S\ 


n + k — 1 \ 

n J 


< \S'\ = (n + l) fc . 


From the simple cast when n = 1, it is apparent that l lie dilfereiice. k vs. 2 fc , can be enormous. For 
this type of closed networks, Buchholz [4] suggested a solution method based on Kronecker algebra 
that does not create unreachable states, applicable when the interaction between submodels is of the 
asynchronous type described in the introduction. 

On the other hand, it is possible for S to equal S 1 . This happens, for example, in a live free- 
choice GSPN with capacities whose undirected graph obtained by ignoring arc directions is acyclic 
(this is a generalization of [13, Property 3], which refers, however, to unbounded marked graphs). 
Another example is that of open acyclic queueing networks with communication blocking due to bounded 
buffers [17] which could be named “open state machines with capacities” in Petri net terminology. The 
transitions in these nets have at most one input and one output place and, if capacities were removed, 
every place would become unbounded. See the GSPN in Figure 3(b) for a simple example of a tandem 
network. Indeed, when S' = St , is simply the mixed-base value of m, hence T^ 1 )^) does not 

have to be stored explicitly. Unfortunately, such a situation is rare. 
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3.2 Merging places into macroplaces 

The type of marking dependence expressed by Eq. ( 6 ) and (7) is quite general, but for example, it does 
not let us specify a firing rate proportional to a nonlinear function of several places (e.g., min{mi, m 2 }). 
We now show how this limitation can be overcome in practice by merging places (pi and p 2 , in our 
example). 

Consider a GSPN A = ("P, T,X, C“, C + , m°, w) with finite reachability set S, and partition V into 
V = {V\, • • .VifA, where Vi = {pi 1 , ■ ■ - Pi . }. Then, define an order-preserving bijection 7 : S — > S C 

lN\ f \ 

7 (mi, . . . ,m|p|) = (mi, • • • ,m^|) satisfying 7 ( 111 ) 7 ( 11 ) <*=> m n, (17) 

where rrq is the position, in lexicographic order, of (m 8l , . . . , ^) in the set obtained by projecting S 

over Vi . 

Lemma 3.2 Given A, V, and 7 defined as above, consider the “compacted” GSPN 

A = (V,T = T,1 = J,C-,C+,m° = 7 (m°),w), 

where the input and output arc cardinalities are defined to ensure that, in corresponding 
markings m and 7 (m) = m, tj £ T has concession in A iff it has concession in A and that, 
in this case, m-bn = 7 ( 11 ) in A iff m-bn in A , while the weights for tj are defined to have 
the same value in corresponding markings: 

• If tj £ £(m) and its firing does not change the marking of any place in Vi, that is, if 
Vp/ £ Vi : C;“'(m) < rrq A C,“-(m) = Cjj(m), dehne C^(m) = Cjj(m) = 0. 

• If tj £ P(m) and its firing changes the marking of some place(s) in Vi, that is, if 
\/pi £ Vi : C,“-(m) < m ; A 3 pi £ p t , C,“ (m) ^ Cjb(m), dehne C 8 “'(m) = m 8 and 

C+-(m) = n,-. 

• Otherwise, tj is disabled in m, that is, 3 pi £ Vi, Cj”-(m) > m;; then, dehne Cj"j(m) = 
m 8 ' + 1, while the value of Cjj(m) is irrelevant. 

• Dehne w 3 (m) = Wj(m). 

Then, the transition rate matrices R and R, dehned by A and A, respectively, are identical. 

Proof: Omitted for brevity (it is sufficient to show that the stochastic processes described 
by A and A are identical). □ 

Lemma 3.2 allows us to compact an arbitrary set of places into a single place, which, together 
with the transitions connected to it, corresponds to a sub-GSPN of [11, 12]. This operation must be 
performed when the marking dependencies in the GSPN are not of the type allowed by Theorem 3.1. 
It might be performed, even when the theorem is applicable, to reduce the number of matrices involved 
in the description of R at the cost of increasing their size. 

From now on, macroplaces are indicated as dashed boxes surrounding sets of places; the compacted 
GSPNs are not shown explicitly, since they would not add to the comprehension of the model. 
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4 Timed synchronizing transitions 

The contributions in [4, 5, 11, 12] have assumed that the GSPN is decomposed in such a way that each 
iteration of the solution method performs Kronecker products of few large (but manageable) matrices, 
while Theorem 3.1 uses many (|T| • \V\) small (n 8 - X rii) matrices. 

Lemma 3.2 addresses the size issue: we can merge places, thus obtaining \T\ ■ \ V\ larger matrices. In 
this section, we show how the number of matrices involved can be further reduced by merging transitions, 
or rather, the corresponding matrices. The results are similar to those derived by previous authors, who 
assumed all synchronizing transitions are timed, but we present them here for three reasons. First, 
we exhibit substantially different proofs for these results; [4, 5, 11, 12] consider a set of sub-GSPNs 
and combine them using synchronizing transitions, thus Kronecker operators are introduced only at 
the last step. Instead, we start from the Kronecker expression of Theorem 3.1 for the entire GSPN 
and derive our results by exploiting the properties of Kronecker operators. Second, our results include 
the management of immediate transitions and vanishing markings, while previous works have simply 
assumed that these are eliminated locally using the traditional approach. Finally and most importantly, 
our result apply to a larger class of GSPNs, since a more general marking-dependent behavior is allowed 
by Theorem 3.1. 

4.1 Partitioning the set of transitions 

Without loss of generality, we assume from now on that each transition in X has at least one input or 
one output place: Vtj £ T, 3 pi £ V, C~ 3 ^ 0 V Cfj ^ 0. Then, let T* C T be the set of “local” 
transitions which affect, or are affected by, only a single place pp. 

\/p t £ V : X® = \tj £ X | £V,pi ^ pi, C,“ = C]h = 0 A w itj = l} , (18) 

and X* = X \ (J Vl ^p X® be the set of synchronizing transitions which instead affect or are affected by at 
least two places. Clearly, these sets constitute a partition of X. Also, let X * = X* fl X , X* = X* fl X, 
X 1 = X® OX, and X® = X® D X. 

Lemma 4.1 Consider a GSPN satisfying the requirements of Theorem 3.1. Then, 


fs 

II 

M 

(g) w®'- j 

+ ® R‘ , 

u ' = E A 

• (g) W®’ 

’ + © u‘. 

(19) 

tjEX* 

p t ev 

Pi£V 

t,eT* 

p t ev 

Pi£V 


II 

M 

• (g) 

+ ® x . 

II 

M 

■ (g) r* 

+ ©r\ 

(20) 

tjEX* 

p t ev 

Pi£V 

be i' 

p t ev 

Pi£V 



where R®, IP, A\ and X® are square matrices of size rii X rii defined as 

r® = y, w r W ® J , u® = Y w r W ® J , a* = Y , r®= Y w r ri,j - 

tjEX* qeA tjEX* qeA 

Hence, as special cases, R® = A 1 = 0 if X 1 = 0 and U® = X® = 0 if X® = 0. 
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Proof: We only prove the result for R/, the proof for the other matrices is analogous. Given 
the condition specified by Eq. (18), we know that W /j = \ ni if pi ^ pi and tj £ T 8 . Then, 
the proof is a simple matter of matrix manipulation within the Kronecker expressions: 


II 

M 

0 

0 W 8 ’ J 





t 3 eX 


p t ev 





= £ 

* 

W J 

• 0 w 8 '- j 

; + 

£ £ y 

• 0 w ij 


t 3 EX' 


p t ev 


Ptev tjEX* 



= £ 

* 

W J 

• 0 w 8 '- j 

; + 

£ £ y 


® ® w 8j ® I nt+1 (£)•••(£) I np| 

t 3 EX' 


p t ev 


Ptev tjEX* 


/ \ 

= £ 

* 

W J 

• 0 W 8 '- J 

; + 

£ l ni ® • • 


I £ w* ■ W 8 J | ®I nt+1 

tjEX* 


p t ev 


P,ev 


\tjEX* ) 

= £ 

* 

W J 

• 0 W 8 '- J 

; + 

£ l ni ® • • 

® I n t — i (8) 

Rf ® I n, + 1 G---G I n m 

tjEX* 


p t ev 


P,ev 



= £ 

* 

W J 

• 0 w 8 '- j 

; + 

©R‘- 


□ 

t 3 EX' 


p t ev 


Pi£V 




This partition reduces the number of Kronecker product terms from |A| to |A*| in R' and from \X\ to 
|J*| in U', respectively, and adds one Kronecker sum to both. Transitions satisfying Eq. (18) arise after 
applying Lemma 3.2, that is, after “decomposing” a large GSPN into several smaller sub-GSPNs. Each 
sub-GSPN corresponds, in our terminology, to a (macro)place, plus the set of transitions local to it. 
This transformation does not have to be explicitly performed in practice, only its result, the matrices 
corresponding to the set of macroplaces, need to be computed. A good partition of the places results 
in a compacted GSPN where most transitions are local and the number of tokens in each compacted 
place (number of markings in the sub-GSPN, in the terminology of [11, 12]), is manageable. Methods 
to determine a good partition are beyond the scope of this paper and are left for future research. 

4.2 An efficient Kronecker expression for the CTMC 

Given any GSPN, we can always apply Lemma 3.2, resulting in a compacted GSPN satisfying the 
requirements of Theorem 3.1, then apply Lemma 4.1. If the partition is such that all immediate 
transitions are local, we can then restate the main results of [11, 12] in a more general setting. 

Theorem 4.1 Consider a GSPN satisfying the same requirements as for Theorem 3.1, and 
such that all immediate transitions are local: 

J* = 0 =>■ U' = 0 U\ T* = A*. (21) 

p t ev 


Then 


Q" = (w 8 '- j • X 8 ) + 0 (r 8 • X 8 ) - A', 

tj ex- Pt ev K J Pt ev K J 
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where 


X 8 = (0, - (T 8 + r’') _1 U’') 1 r = rowsum(lP) T 8 = I ni - 6{r ), 

satisfies Q s T ,s T — Q’ with the meaning defined in Eq. (2). 

Proof: First, we observe that condition Eq. (21) implies r' = i” 8 and T' = © ) Pl( =.-p T 8 . 

In other words, a “global” marking m is tangible iff all its “local” components are tangible. 
We can then manipulate Q" as follows: 

Q" = (w 8 ’ 2 • X 8 ) + 0 (R 8 • X 8 ) - A' 

tj ex- Pt ev K J Pt ev K J 

= E W J • 0 WM • 0 X 8 + 0 R 8 • 0 X 8 + 0 (r 8 • X 8 ) — 0 R 8 • 0 X 8 -A' 

tjex* Pl ev Pl ev Pl ev Pl ev Pl ev Pl ev Pl ev 

' v ' 

D' 


= E^'0 w ’’ ,j + 0 R‘ • 0 X 8 + D' — A' 

\t 3 ex* Pl ev Pl ev / Pl ev 

= R' • 0 X 8 + D' - A' 

p t ev 


Partition <S 8 into Sj and Sy, corresponding to local markings enabling only timed local 
transitions, or some immediate local transition, respectively, and rearrange the rows and 
columns of X 8 accordingly: 


X 8 


(T 8 + r 8 )- 1 ^' 


Si 
1— 1 

L°_l 

pi 

. r V,T 

— i 
£ 


where the subscripts “T, T”, “V, T”, and “V, V ” have the usual meaning, but applied to the 
local matrix for place i. N 8 y ,V = (r 5vi5v - (n v ^v) _1 • gy'j describes the expected 
number of visits to each local vanishing marking before reaching a local tangible marking, 
starting from each each local vanishing marking, while P l VT = s ■ (r t Sy 5 V ) _1 • St 
describes the probability of reaching each local tangible marking, starting from each local 
vanishing marking. 

We continue assuming that \V\ = 2, the general proof follows exactly the same idea. Local 
matrices are partitioned according to whether the corresponding local markings are tangi- 
ble or vanishing (regardless of whether the global markings are reachable or not). Global 
matrices are partitioned according to the following order: tangible states, vanishing states 
enabling only immediate transitions in T 2 , vanishing states enabling only immediate transi- 
tions in T 1 , and vanishing states enabling immediate transitions in both T 1 and T 2 . First, 
we show that the tangible rows of D' are zero: 


D' = (R x X x © R 2 X 2 ) — (R 1 © R 2 )(X x 0 X 2 ) 

= R x X x (8) I n2 + I ni (g> R 2 X 2 - R x X x (g> X 2 - X 1 (g> R 2 X 2 
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Figure 4: A GSPN to illustrate the difference between Q' and Q". 


= RdX 1 0 (I „ 2 - X 2 ) + (I ni - X 1 ) 0 R 2 X 2 



since X* and \ ni have the same tangible rows. Then, we show that the tangible columns of 
and (I — (T' + i n/ ) _ 1 U / ) 1 coincide: 



The blocks in the first (tangible) column of this last matrix contain the correct values, 
since the top left block is simply the identity and the other blocks correctly describe the 
probabilities of reaching tangible markings from vanishing markings, which are the values 
in the corresponding blocks of (I — (T' + i n/ ) _ 1 U / ) 1 . From Ecp (22) and (23) we can then 
conclude that Q s T ,s T — Q , as in the proof of Theorem 3.1. □. 


In practice, Theorem 4.1 is used to generate only the relevant portion of Q" in the numerical solution 
method. In other words, we eliminate the vanishing markings “on the fly” (as in [4, 5, 10, 11, 12]) 


Q = 


= E 


W !J • X *) TT + 0 (R*‘-X' 

’ P,ev 
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We observe that both Q' and Q" describe Q, but the two normally differ. For example, consider the 
GSPN in Fig. 4, having a single tangible state and three vanishing states (the example is trivial, but it 
is sufficient to illustrate the point). Assuming that the rates of timed transitions A, t 3 , and t 5 are a, 6, 
and c, respectively, and that the weights of transitions t 2 and t 4 are 2 and 3: 



0 a 


' 0 

b ' 


' 0 

1 ■ 

w 2 ’ 5 = 

' 0 

1 ■ 


' 0 

0 ' 


' 0 

0 ' 

R 1 = 

R 2 = 


w 1 ’ 5 = 

u 1 = 

u 2 = 


0 0 


0 

0 


0 

0 


0 

0 


2 

0 


3 

0 


From which we obtain 


' 0 

0 ' 


' 0 

0 ' 


'10' 


' 1 0 ' 


' 1 0 ' 


' 1 0 ' 

w 2 = 

T 1 = 

f j 1 2 

X 1 = 

X 2 = 

_ 0 

2 _ 


_ 0 

3 


_ 0 0 _ 


_ 0 0 _ 


_ 1 1 _ 


_ 1 1 _ 


and 


' 0 

b 

a 

c 


' 1 

0 

0 

0 

0 

0 

0 

a 

T' = 

0 

0 

0 

0 

0 

0 

0 

b 


0 

0 

0 

0 

. 0 

0 

0 

0 _ 


. 0 

0 

0 

0 


U' 


0 0 0 0 
3 0 0 0 
2 0 0 0 
0 2 3 0 


0 0 0 0 
0 3 0 0 
0 0 2 0 
0 0 0 5 


The resulting Q' and Q" matrices are then different: 


a + b + c 

b+lc 

a + |c 

c 



a + b + c 

b + c 

a + c 

c 

a 

| a 

| a 

a 

- A' 

Q " = 

0 

a 

0 

a 

b 

5 

¥ 

5 

¥ 

b 

0 

0 

b 

b 

0 

0 

0 

0 _ 



0 

0 

0 

0 . 


Indeed, the difference between Q' and Q" is already apparent from Eq. (23). The diagonal blocks 
I^ T (f) Nyy and Nyy ®Ij T correctly describe the expected number of times the corresponding global 
vanishing markings (enabling only local immediate transitions in T 2 or T 1 , respectively) are entered, 
given that a timed transition firing leads to the corresponding diagonal block. However, the last three 
blocks on the bottom row do not reflect the same quantities when a timed transition firing leads to 
vanishing markings enabling immediate transitions in both T 1 and T 2 . In particular, Py T <8> N 2 
describes the correct quantity only if we could assume that all enabled immediate transitions in T 1 keep 
firing before any of those in T 2 do, which is not necessarily the case, while Nyy ® Pyx assumes the 
opposite. Finally, Nyy ® N^j, does not reflect the number of times global markings are entered at all. 
This leads us to the following observation. 


Corollary 4.1 If the GPSN of Theorem 4.1 is such that the firing of any timed (synchro- 
nizing) transition tj £ X* enables (local) immediate transitions in at most one set T\ for 
some pi £ V, then Q^ t s = F, as defined in Eq. (5). 

Proof: The condition for this corollary implies 

5yC U X • • • X sif 1 X S l y X S^ 1 X • • • X S l T l . 
p t ev 
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As discussed in Theorem 4.1, the blocks on the columns corresponding to this type of vanish- 
ing markings are computed correctly in 0 X®, see Eq. (23). The bottom rows of Eq. 23, 

corresponding to any (unreachable) vanishing markings m enabling immediate transitions 
in more than one set T®, are irrelevant, since Lemma 3.1 guarantees that R^ t m = 0 in this 
case. □ 


5 Immediate synchronizing transitions 


If some of the synchronizing transitions are immediate, Theorem 3.1 still applies, but Theorem 4.1, 
which allows the efficient computation of the solution in practice, does not. In this section, we show 
how “preservation of the vanishing markings” [ 8 ] can be used to remove this limitation, also present in 
[ 11 , 12 ]. 

5.1 Embedding a DTMC 

First, we summarize the main ideas in [ 8 ], which examines an alternate method to compute 7 r: 

• Define the transition probability matrix P of the embedded DTMC, expressing the probability of 
going, in one firing, from any marking m £ S to any other marking n £ <S, regardless of whether 
they are tangible or vanishing: 


(24) 


Si 

V 

1 

1 

A 1 Rx,v ] 

Si 

7 

1 

1 

7 


• Compute the steady-state probability vector 7 £ 1R} S \ of the embedded DTMC: 

7 • P = 7 subject to the normalization 7 • l|s| x i = 1- 

• Obtain both 7T and </> from 7 , using the holding times in the tangible markings as weights: 

7m ' hm J w ^ c j. Am 


(25) 


Vm £ St , 7r m = 


and 


Vm £ S, 4> m = 


y in 7 n y in 7 n hn 

The correctness of the method can be verified by observing that, from 

= [7 T I lv\i 

T | V V ,V J 

we can obtain 7 v = 7 T A _ 1 Rx,y (I — Uyy) 1 , and, substituting it in the above equation, 

7 t A 1 • (R t ,x + Rxy (I — U(/y) 1 Uvyr) = 0. 

7r.(7 T .A _1 .i| S | X i) 


(26) 


w 1 • 

Si 

7 

1 

A 1 Rx,v ] 

7vJ 

Si 

7 

1 

7 


Q 
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We can then divide both sides of the equation by the constant 7 T • A~ x ■ l|s| x i, resulting in 7r • Q = 0. 

In [ 8 ], it was found that the solution time is often greater than with the “elimination” approach 
based on Eq. (3). This is due to the number of nonzero entries in P, normally larger than in Q, and, 
frequently, to a slower numerical convergence. However, pathological cases where P has substantially 
fewer entries than Q arise when N tangible markings can reach a small set of vanishing markings, which 
can, in turn reach M tangible markings. This “N-to-M switch” behavior corresponds to 0(N + M) 
arcs in P and 0(N ■ M) in Q, hence it affects the elimination approach negatively both in terms of 
storage and execution time, although the number of iterations in the numerical solution might still be 
smaller with elimination. The use of preservation results in the following analog of Theorem 3.1. 


Theorem 5.1 Under the same conditions of Theorem 3.1, the matrix 


P' = (T' • A' + r')- 1 • (T' • R' + U') 


(27) 


satisfies P^ 5 = P, as defined in Eq. (24). 

Proof: The pre-multiplication of A' and R' by T' eliminates the effect of timed transitions 
having concession in vanishing markings. However, if we focus on the reachable states, the 
statement of the theorem is equivalent to saying that 


P 


1 

hi 

<0 

hi 

1 

1 

<0 

hi 

L r's v ,s v 

1 

<0 


(28) 


This equality then follows from the definition of P and the meaning of R' and U 7 already 
established in Theorem 3.1. Eq. 28 is, of course, the expression used in practice for a 
numerical solution. □ 


The efficiency of a solution based on Eq. (28) is improved by exploiting the existence of local 
transitions (Lemma 4.1): 


-1 


£ w* (g) A l,J + 0 A 1 • £ w* • (g) + 0 R 


P' eV / S T ,S 


V Pi^V / St , S T Vj eX ' P^ V 

E - <8> r '-‘ + ® r ) -|E >■>; - <E> W ‘ J + © u ‘ 


yqe x* p.er 


/ s v ,s v Vb eJ * Pie? 


P.er 


s v ,s J 


(29) 


since this expression for P reduces the number of Kronecker products to be performed at each iteration. 


5.2 Using partial elimination to improve solution efficiency 

A disadvantage of the approach just described is that the size of the probability vector 7 for the DTMC 
is now |<S|, considerably larger than |<Sx| in many practical models. Analogously, the size of the matrices 
for place i is given by the projection of S onto its z-th component, S l = {/ : 3m £ <S, irq = /}, regardless 
of whether markings satisfying irq = / are vanishing or tangible. 
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It was already observed in [8] that it is possible to eliminate a subset of the vanishing markings, 
preserving only those involved in large switches, in the hope of achieving the best memory-execution 
tradeoff. We can exploit the same idea in our approach, but for a different purpose. Partition the local 
state space for place z, S\ into 

• Sj = {/ : Vm G St, Hi; = /}, the set of local tangible markings. 

• Sg = {/ : 3m G <Sy,m 8 = / A Z* fl £(m) ^ 0}, the set of possibly synchronized local vanishing 
markings. 

• S l L = {/ : Vm G <Sy,m 8 = ! =? Z* C I £(m) = 0}, the set of non-synchronized local vanishing 
markings. 

Any immediate synchronizing transition can be enabled only in markings having components in Sj U1S5, 
that is, in S s = S v n ((<$£ U S\) x • • • x (<sf' U ' ')) • 

We now define a “partially eliminated” (or “partially preserved”) DTMC with transition probability 
matrix P and state space Sk = St U Ss, (K stands for ‘keep”) which can be used to compute 7 r and </> 
more efficiently than from P. Partition the matrices W ,J , R®, U*, A hJ , X® J , A\ and X®, according to 
the sets S l K = Sj U Sg and S l L . For example, 



Then, assuming that the rows Un/v|U^ L J are already normalized (this can be easily enforced since 
each U® is built before starting the overall solution), define the matrices 

= wjy- + w;y • (1 - uy ,)' 1 • uk, A -, 

R 1 = Rk- A - + Rk-x • (I - UU)‘ l • Ui A -,and 
U' = ukjr + Uk-x • (I - «*«,)'' • Ukjf- 

Given this definition, the blocks for the rows and columns of Sk in zl® J , X® J , A\ and X® still contain 
the correct row sums for the corresponding matrices. Then, we can state our final theorem, which 
allows the efficient solution of a structured GSPN with immediate synchronizing transitions. 

Theorem 5.2 Under the same conditions of Theorem 3.1, define the transition probability 
matrix 


(30) 



a k,i< + ® A K,l 


z^ w j yy t <07 ji -k,k 

t 3 ex* Pt ev p t ev 


\ 3 


t 3 eX* p t ev 


r® J -i- ZDi r® 

1 K,K “T Vt7 1 K,1 


z^ yy * k,k t k ,k 

tjei* p t ev Pt ev 


W® J + 0 U' 


tj e J* p t er 
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and solve for 7 • P = 7 subject to the normalization 7 • 1 x 1 = 1 - Then, the steady-state 
probability and the rate of entering the markings in Sk are 


Vm G St , 


7 T m — 


7r 


' h r 


Sne5 T "Yii ■ hr 


Vm G Sk-, 4 * m — 


7r 


Sne<S T 7n ' hr 


Proof: We only need to show that P correctly describes the DTMC obtained when embed- 
ding the GSPN at the times when markings in Sj U Ss, but not those in Sl = S \ (St U Ss), 
are entered. In other words, 7 should differ from 7^ only by a multiplicative constant, 
where [y K | 7 L ] = 7 is the solution to Eq. 25 , which, in block form, is 


[1k\1l\ 


P K,K 

P K,L 

P L,K 

P L,L 


[1k\1l\- 


We can then obtain ~f L = • P k,l • (I — Pyn) 1 and, by substitution, 

Ik ' (Px,l • (I — P z,,l) 1 • P l,k + P/y/v) = Ik- 


Then, it is sufficient to show that (Pk,l • (I — P l,l) 1 • P l,k + P/y/v) and P coincide. For 
any m, n G Sk, (P/y/v) m n re P resen t s the probability of going from marking m G Sk to 
marking n G Sk in a single firing, while (Pk,l • (I — P/y/v) _1 ' Pi,A') m n represents the 
probability of going from m to any marking m 1 G Sl, visiting any number of markings in 
Sl, and finally leaving Sl from some marking m 2 (possibly the same as m 1 ) to reach n in 
one firing. 

Assuming that m G St and n differs from m in at most the position for place pi, the 
corresponding entry P m ,n is 


-1 


Pm,n — X! W j 0 ^K,K + O ^K,! 


y w r 0 + 0 R 


v yen’* Pt ev 


P,ev 


m,m 


P,ev 


P,ev 


m,n 


= A 


_1 • y w* • IT w ,,J + r' 

in. in \ / ; ^ j 11 TT m*,n* ' iii/.ii 

ex* Pl ev 


( 


= A 


-1 

m,m 


t,eX' 


( 

P,er y 


mje s i L 


m?E s i L 


I>yn wy in ,+ Y. w h,m. E (i-uy)„j m ,uy, ni 


+ 


m \es l L 


m. 2 ies l L 


rl. n + y rl m i y fi-uy ) 1 u* 2 

men; 1 \ L ’ L J m.j ,m 2 m i ’ 


,n i 


(if n and m differ in more than one position, the cause must be the firing of a synchronizing 
transition, so the “local term” for place pi in the last expression is absent). In any case, 
Ij^m is just a normalization factor (if m G Ss , Phim wou ld be used instead), so the 
expression indicates the required probability. The key issue is that the order of firing of 
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Figure 5: A GSPN with an immediate synchronizing transition. 

local immediate transitions does not affect the probability of reaching a given n £ Sk, since 
their weights and disabling are decided locally: the events “going from m- to nff for each 
Pi are independent, so their product correctly describes the overall probability of going from 
m 1 to n. □ 

We stress that our approach might not eliminate every vanishing marking enabling only local inime 












diate transitions. For example, consider the GSPN in Fig. 5(a). We have = {(0), (2)}, Sg = {(1)}, 
Si = 0, Si = {(0)}, Sg = {(1)}, and Si = {(2)}, as shown in Fig. 5(b). After eliminating the local 
markings Si from the second sub-GSPN, we obtain the “local reachability graphs” in Fig. 5(c). Fig. 
5(d), 5(e), and 5(f) show the graphs describing P', P, and P, respectively. The dotted arcs in Fig. 5(d) 
correspond to timed transitions with concession in vanishing markings, which lead to markings in S 1 . 
These are not present in Ph Hence, (global) marking (1,2) is unreachable and is absent in P and P. 
Furthermore, markings (0,2) and (2,2) are absent from P, because their second component, 2, enables 
only local immediate transitions in the second sub-GSPN, (2) ^ Sg. However, marking (1,0) is still 
present in P, even if it enables only G, a local immediate transition. This is because its first compo- 
nent, 1 , corresponds to having a token in p 3 , which is a condition for the enabling of the synchronizing 
immediate transition G- In other words, we cannot eliminate the local marking ( 1 ) from the local state 
space for the first sub-GSPN, because this would eliminate both global markings (1,0) and (1, 1), and 
eliminating ( 1 , 1 ) would make it impossible to capture the effect of synchronizing transition G in the 
Kronecker products of Eq. (30). 

6 Numerical solution 

Applying Theorem 4.1, we use the generator matrix Q = St to compute the stationary distribution 
7 r G 1R} St \ of the CTMC underlying the GSPN according to Eq. (3). We use an approach based 
on Kronecker algebra to avoid storing Q explicitly, so only iterative methods which do not require the 
modification of Q itself can be used effectively. Adopting the Jacobi method with overrelaxation (JOR), 
we transform Q into the iteration matrix M = (1 — cu) • I + cu • R • diag(h) and solve the eigenvector 
problem: 

7 r • M = 7 T subject to 7 r • l|s T | xl = 1. 

Successive approximations of 7 r are obtained iteratively as 

7r [m+1] <- tt h • M, (31) 

starting from an initial probability vector 7 r^ satisfying 7 r^ > 0 and 7 T^ 1 | 5 t | x1 = 1. If the CTMC is 
ergodic and if the iterations converge, JOR is guaranteed to result in the correct solution, regardless 
of the value of 7 r^, that is, 7 T = linv^oo 7 r^, if this limit exists. We do not discuss the choice of the 
relaxation parameter uj, 0 < uj < 2, which affects the convergence rate [16]; for a detailed analysis of 
numerical techniques for the solution of Markov chains, see [15]. 

We then need to multiply a full vector, 7r^, by a matrix, R, which is described as a (submatrix of 
a) Kronecker expression of smaller matrices, 

R=fEK'0^+© ff ) > 

\ tj ex- P ,ev Pl ev / g T: g T 

where R* = R* • X* and W* J = W' ,J -X*. A similar discussion applies when using Corollary 4.1 (to 
obtain </> m for m £ Sy after 7 r has been obtained, we compute 7 r • 5 ) or Theorem 5.2 (to obtain 

we perform the product 7 ^ • P). 
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First, we precompute St, IF^ 1 , an d h, for each tangible marking: 

• During the reachability set construction, St is normally stored in lexicographic order in a balanced 
search tree. However, we can assume from now on that is accessed exclusively using 'F^ 1 . 

• After the reachability graph construction, is stored as an array of size |<Sx| of pointers to the 
markings. 'F^ 1 ( k ) points the k - th marking, that is, the marking m satisfying 'F(m) = k. Several 
methods can be used to store a marking. In principle, a single integer describing the position of 
m in S' according to lexicographic order is sufficient, but, in practice, this integer might require 
more than 32 bits. 

• h g carries the same information as A. Instead of storing this vector explicitly, we could 

recompute A at each iteration, to further reduce the memory requirements. This is a memory- 
execution trade-off. 


It is important to note that an alternative method suggested by Kemper [12] for the state space ex- 
ploration has the advantage of allowing the determination of whether a marking is reachable in 0(1) 
instead of 0(log |<Sx|) operations. However, it requires a bit vector of size | 1 , which might be a problem 
when | iS' | |£x|- 


If we define 


i-i Iffi 

\/pi G V, low(i) = Jd n,[ and up(i) = n ni , 

/=! l=i+l 


the contribution of the Kronecker sum to the new iteration vector 7r[ m+1 l in Eq. (31) can be rewritten 
as 

ttH • 0 W = E <8> Q* <8> = E 4? +1] > 

p t ev Pt ev Pt ev 

while the contribution of the Kronecker products is 


E w *j • *- H • <8> 

tjex* p,ev 


E W*j- 7T H 

tjEX' 


• n ® w 8 ’- 


® I up(i) 


E 

tjEX* 


^[rn + 1] 
l 3 


We implemented the basic operation a • (I/ ou ,p) ® A ® I«p(q) in a function mult(&, A,i), where a is 
a vector of size |<Sx| and A is a rii X rii matrix. Each term 7rj,™ +1 ^ i s computed with a single call to 
mult, while each term requires \V\ successive calls to mult. During this iteration, some states 

(possibly unreachable) are passed and temporarily stored. However, if all the states of S' are reachable 
and tangible, that is, if IE = |<Sx|, there is no need to explicitly identify the reachable states. We stress 
that this definition allows any number of Kronecker products to be performed, so it does not restrict 
the possible number of sub-GSPNs involved in a synchronization. 

For the numerical computation we consider a fork/join kanban network of four sub-GSPNs as given 
in Figure 6(b). Each sub-GSPN i = 1, ... ,4, of the network is modelled by the GSPN shown in Fig. 
6(a) on the left. Multiplying R* and W* J by X* corresponds, de facto, to the automatic elimination 
of the immediate transitions t re d 0i and t 0 k t , resulting in the GSPN shown in Fig. 6(a) on the right, 
where the rates of t mire <i 0i and t mt0 k t are the the rate of t m multiplied by the firing probabilities of t re d 0i 
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(b) 



Figure 6: A sub-GSPN and a fork /join arrangement of four sub-GSPNs. 

and toki, respectively. The parameters specifying the stochastic behavior of the sub-GSPNs are the 
rates Wi nj , w m; , and Wb ac k, and the probabilities w 0 k, and w redo> ■ We assume that these quantities are 
constant, but they could depend on the local marking (or even on the global marking, in the case of 
synchronizing transitions) without affecting the feasibility or the complexity of the solution algorithms. 
Pallets enter sub-GSPN i of the kanban network through transition t ; n> , which requires the availability 
of a kanban ticket in place pkanban, ■ Then, the pallet proceeds to the machine, in place p mi . After being 
worked by t mi , a part is checked for quality and it is either transported back to p mi by tj mc k, for further 
rework, or moved out of the machine by t outi . The numerical values of the parameters for the model are: 
w mi = 1.2, w m2 = 1.4, w mo = 1.3, and w m4 = 1.1, while, for each sub-GSPN i, w ok , = 0.7, w red0t = 0.3, 
and w back, = 0.3. All transitions have single-server semantics. The input and output rates for the entire 
kanban network are set by assigning w; ni =1.0 and w outi = 0.9. The synchronizing transition Gync/i! 
corresponds to the merging of transitions {t outl ,ti no ,t; n3 } and has rate 0.4, while t sync h , corresponds 
to transitions {Gin, Gouts Gm 4 } and has rate 0.5. In the computations, we vary the number N = iV; of 
tokens initially in each place pkanban, ■ 

We consider two alternative decompositions of the model. In Case 1, each node of the kanban network 
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corresponds to a sub-GSPN, or one macroplace, in our terminology. Hence, \V\ = 4 and |7b*| = 2. In 
this case, the fork-join synchronization creates unreachable markings (any marking where the number 
of tokens in pkanban 2 and Pkanbanz differ), so that \St\ < |<S / |- Instead, in Case 2, we merge sub-GSPNs 2 
and 3 into a single macroplace, hence \V\ = 3. Now, however, the constraint on the number of tokens 
in Pkanban 2 and Pkanban 3 is taken explicitly into account when generating the local state space for the 
corresponding sub-GSPN. Thus, the overall state space becomes the exact cross-product of the three 
local state spaces, |<Sx| = | 1 , and the computation is more efficient, since there is no need to distinguish 
the reachable states from the unreachable ones. This shows how an intelligent decomposition of the 
model can greatly affect the computational requirements of the solution. 

Finally, we make the two synchronizing transitions immediate and apply Theorem 5.2 to a decom- 
position into either four or three sub-GSPNs, resulting in Case 3 and 4, respectively. We observe that 
the presence of synchronizing transitions leads to unreachable vanishing markings in this model. 


N 

Case 

ei 

e2 

63 

e 4 

T 

1 

1,2 

0.907 

0.671 

0.671 

0.355 

0.132 

2 


1.810 

1.328 

1.328 

0.764 

0.248 

3 


2.722 

1.944 

1.944 

1.154 

0.332 

4 


3.646 

2.515 

2.515 

1.510 

0.393 

5 


4.588 

3.053 

3.053 

1.876 

0.439 

1 

3,4 

0.860 

0.810 

0.810 

0.534 

0.139 

2 


1.691 

1.671 

1.671 

1.268 

0.263 

3 


2.529 

2.545 

2.545 

2.037 

0.348 

4 


3.379 

3.425 

3.425 

2.807 

0.409 

5 


3.789 

4.454 

4.454 

3.914 

0.529 


Table 1: Performance results as a function of N 

We compute the expected number e 4 - of tokens in places p mt7 pback , , and p ou t l for each sub-GSPN of 
the kanban system. For a given i = 1, ... ,4, this corresponds to a reward structure where the reward 
rate p(m) of a marking m is given by + m back, + m ou t , , and the reward impulses r are identically 
zero. We also compute the throughput r of the system, defined as the expected firing rate of Gync/ii • This 
corresponds to a reward structure where the reward rates are identically zero and the reward impulses 
are one if they correspond to the firing of Gync/ii , zero otherwise (the same quantity could be computed 
by observing ti ni , t sync h 2 , or t outi instead). We stress that, in general, reward rates and impulses could be 
marking-dependent. Table 1, shows the resulting numerical values; as expected, e 2 = e 3 . It is apparent 
that the first sub-GSPN is the most loaded for Case 1 and 2, while the second and third sub-GSPNs are 
the bottleneck in Case 3 and 4. This would suggest that, the synchronization behavior is an important 
performance factor, in addition to the actual machining rate of each station. 

Table 2 shows the size of the state spaces St, Sy, and S' as a function of N. The overall number 
of nonzero elements for the sparse matrices used by the Kronecker approach (“local”) can be compared 
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N 

Case 

<Sx 

\$v\ 

l^'l 

local 

nonzero(Q) 

nonzero(P) 

“explore” 

“solve” 

1 

1 

160 

0 

256 

20 

616 

616 

0.017 

0.267 

2 


4,600 

0 

10,000 

80 

28,120 

28,120 

0.967 

12.550 

3 


58,400 

0 

160,000 

200 

446,400 

446,400 

16.267 

309.533 

4 


454,475 

0 

1,500,625 

400 

3,979,850 

3,979,850 

190.500 

4,721.217 

5 


2,546,432 

0 

9,834,496 

700 

24,460,016 

24,460,016 

1,759.000 

22,215.257 

1 

2 

160 

0 

160 

40 

616 

616 

not req. 

0.017 

2 


4,600 

0 

4,600 

186 

28,120 

28,120 

not req. 

1.517 

3 


58,400 

0 

58,400 

678 

446,400 

446,400 

not req. 

50.483 

4 


454,475 

0 

454,475 

1878 

3,979,850 

3,979,850 

not req. 

855.917 

5 


2,546,432 

0 

2,546,432 

4368 

24,460,016 

24,460,016 

not req. 

6,054.783 

1 

3 

152 

8 

256 

20 

600 

608 

0.017 

0.167 

2 


3,816 

697 

10,000 

80 

23,832 

24,529 

0.517 

20.450 

3 


41,000 

13,656 

160,000 

200 

316,360 

330,016 

5.967 

84.600 

4 


268,475 

128,000 

1,500,625 

400 

2,343,050 

2,471,050 

75.783 

719.050 

5 


1,270,962 

769,480 

9,834,496 

700 

12,025,566 

12,795,046 

707.483 

4,111.681 

1 

4 

152 

8 

160 

40 

600 

608 

0.013 

0.150 

2 


3,816 

697 

4,600 

186 

23,832 

24,529 

0.333 

18.550 

3 


41,000 

13,656 

58,400 

678 

316,360 

330,016 

5.550 

76.250 

4 


268,475 

128,000 

454,475 

1878 

2,343,050 

2,471,050 

67.750 

647.100 

5 


1,270,962 

769,480 

2,546,432 

4368 

12,025,566 

12,795,046 

725.333 

3,562.531 


Table 2: Computational and storage requirements as a function of N. 

to that of conventional solution methods that explicitly generate Q. For comparison, we also list the 
number of nonzero elements in P, that is after eliminating the local vanishing markings. For Case 1 and 
2, this makes no difference (there are no synchronizing vanishing markings), while, for Case 3 and 4, the 
memory requirements are somewhat larger. It is clear that the only practical limitation of our approach 
is the memory required by the two iteration vectors, 7 and 7 r[ m+ 1 l, and the addressing vector Tx 
(of sizes |<Sx|), or by 7 ^, ^4 m+1 ], and vp (of sizes |<Sx| + |<Sy|), plus the vector h (of size |<Sx|), if not 
computed at each iteration. It is also apparent how the approach allows the solution of problems one 
order of magnitude larger than with conventional methods in an acceptable amount of time. 

The computation times “explore”, to explore the reachability set, and “solve”, for the Jacobi method 
with relaxation parameter u = 0.9, are given in seconds. The vector h is stored explicitly. The 
convergence criterion is set to Htt^ — 7 r [ m+1 ]|| 00 < 10 -6 (or H 7 H — 7[ m+1 ]|| 00 < 10 -6 ). The uniform 
distribution was used as the initial guess for (or 7 M). The program was run on a Sony NWS-5000 
workstation with 90 Mbyte of main memory. 

We stress that the elimination approach (Theorem 4.1) should be used whenever possible, and that 
preservation of the non-local vanishing markings (Theorem 5.2) should be used only when there are 
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immediate synchronizing transitions. Preservation of all vanishing markings (Theorem 5.1) is probably 
never appropriate, since memory usage is the paramount consideration, and this approach increases the 
already critical size of the probability vectors. In few pathological cases, it could reduce the size of the 
“local” data structures, but these are not critical. 

7 Conclusion 

We rigorously formalized and implemented an approach based on Kronecker algebra for the solution of 
the CTMC underlying a GSPN. The results extend previous works by Donatelli, Buchholz, and Kem- 
per [4, 5, 10, 11, 12], to include immediate synchronizing transitions, quite general marking-dependent 
behavior, and a reward structure allowing reward impulses associated with immediate transition brings. 
The restrictions imposed on the GSPN are minimal, thus the approach has obvious practical appli- 
cations. Furthermore, the structure of the GSPN itself gives strong hints on its decomposition. For 
example, if an “elimination-based” solution is desired, the GSPN must be decomposed so that immedi- 
ate transitions are local to a sub-GSPN; if the marking-dependency of a transition does not satisfy our 
requirements, all the places responsible for this behavior should also be merged in the same sub-GSPN. 

Memory requirements are still the main limitation to the solution, but these have now been reduced 
from the size of the transition rate matrix to that of the steady-state probability vector, for a very general 
class of GSPNs. Even for highly sparse matrices, this corresponds to the ability to solve problems whose 
state space is one order of magnitude larger than with a traditional solution approach. 

To further increase the size of models that can be solved, we can use the distributed state-space 
generation algorithm described in [7], which allows one to partition the memory and execution require- 
ments to generate the state space over a set of workstations, and which exhibits excellent speedups for 
large problems. The Jacobi method we employed can also be parallelized using a set of workstations, so 
that the entire solution process is performed in a distributed fashion and uses the available memory. In 
particular, the “local” matrices occupy a negligible amount of memory and can be duplicated on each 
processor, while only the probability vector needs to be distributed. 

Since our results are complementary to those in [7], we can reasonably hope for a two-orders of mag- 
nitude increase in the size of manageable models, assuming the availability of a network of workstations. 
As a target for the near future, we will attempt to solve CTMCs with 10 8 states and a transition rate 
matrix with 10 9 non-zeros in a matter of hours using a dozen workstations with 128Mbytes of memory 
each. 

Eventually, even CTMCs of this size might not be enough to satisfy the needs of a serious modeler. 
Our results, combined with the distributed state-space generation algorithm, are directly applicable to 
the family of approximations suggested in [17]. We can simply consider each sub-GSPN separately for 
the coarsest model, two adjacent sub-GSPNs for the next more accurate model, then three adjacent 
sub-GSPNs and so on. The comparison of the results from successive approximate models can suggest 
an estimate of the quality of the results. This allows trading off lower approximation errors against 
higher computational effort. We are currently investigating the requirements to assure an accurate 
solution (e.g., approximation errors below 5%) on a single workstation for a large class of models, where 
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the CTMCs may have more than lO 30 states. 

A prototype version of the program used to compute the results we presented is available and can 
be obtained by contacting the second author. The input to the program has a SPNP-style syntax [9]. 
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